On the Comparison Complexity of the String Prefix-Matching Problem

Authors

  • Dany Breslauer
  • Livio Colussi
  • Laura Toniolo
Abstract

In this paper we study the exact comparison complexity of the string prefix-matching problem in the deterministic sequential comparison model with equality tests. We derive almost tight lower and upper bounds on the number of symbol comparisons required in the worst case by on-line prefix-matching algorithms for any fixed pattern and variable text. Unlike previous results on the comparison complexity of string-matching and prefix-matching algorithms, our bounds are almost tight for any particular pattern. We also consider the special case where the pattern and the text are the same string. This problem, which we call the string self-prefix problem, is similar to the pattern preprocessing step of the Knuth-Morris-Pratt string-matching algorithm that is used in several comparison-efficient string-matching and prefix-matching algorithms, including our new algorithm. We obtain roughly tight lower and upper bounds on the number of symbol comparisons required in the worst case by on-line self-prefix algorithms. Our algorithms can be implemented in linear time and space in the standard uniform-cost random-access-machine model.

∗BRICS – Basic Research in Computer Science, Centre of the Danish National Research Foundation, Department of Computer Science, University of Aarhus, DK-8000 Aarhus C, Denmark. Partially supported by the ESPRIT Basic Research Action Program of the EC under contract #7141 (ALCOM II). Part of the research reported in the paper was carried out while this author was visiting the Istituto di Elaborazione dell’Informazione, Consiglio Nazionale delle Ricerche, Pisa, Italy, with the support of the European Research Consortium for Informatics and Mathematics postdoctoral fellowship.
†Dipartimento di Matematica Pura ed Applicata, Università di Padova, Via Belzoni 7, I-35131 Padova, Italy. Partially supported by “Progetto Finalizzato Sistemi Informatici e Calcolo Parallelo” of the Italian National Research Council under grant number 89.00026.69.
‡Dipartimento di Matematica Pura ed Applicata, Università di Padova, Via Belzoni 7, I-35131 Padova, Italy. Parts of the research reported in this paper were carried out while this author was visiting the Institut Gaspard Monge, Université de Marne-la-Vallée, Noisy-le-Grand, France, supported by “Borsa di studi per attività di perfezionamento all’estero” from the University of Padua, and BRICS, Department of Computer Science, University of Aarhus, Aarhus, Denmark, supported by the Gini Foundation of Padua.
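
To make the two problems named in the abstract concrete, the following is a minimal Python sketch. It is not the authors' algorithm and makes no attempt to match their comparison bounds or their on-line setting; it swaps in the well-known Z-array (window reuse) technique. The function names `self_prefix` and `prefix_match` are hypothetical, chosen only for illustration.

```python
def self_prefix(p):
    """Self-prefix problem: z[i] is the length of the longest prefix of p
    that occurs starting at position i of p (z[0] is len(p) by convention)."""
    m = len(p)
    z = [0] * m
    if m:
        z[0] = m
    l = r = 0  # invariant: p[l:r] == p[0:r-l]
    for i in range(1, m):
        if i < r:
            # Reuse what is already known inside the window [l, r).
            z[i] = min(r - i, z[i - l])
        while i + z[i] < m and p[z[i]] == p[i + z[i]]:
            z[i] += 1
        if i + z[i] > r:
            l, r = i, i + z[i]
    return z


def prefix_match(p, t):
    """Prefix-matching problem: out[i] is the length of the longest prefix
    of pattern p that occurs starting at position i of text t."""
    z = self_prefix(p)
    m, n = len(p), len(t)
    out = [0] * n
    l = r = 0  # invariant: t[l:r] == p[0:r-l]
    for i in range(n):
        k = 0
        if i < r:
            # Positions inside the window are answered from the pattern's
            # own self-prefix table, then extended if they reach the edge.
            k = min(r - i, z[i - l])
        while k < m and i + k < n and p[k] == t[i + k]:
            k += 1
        out[i] = k
        if i + k > r:
            l, r = i, i + k
    return out


if __name__ == "__main__":
    print(self_prefix("aabaab"))
    print(prefix_match("aab", "aaabaab"))
```

Positions where `out[i]` equals `len(p)` are exactly the occurrences of the whole pattern, which is why prefix matching generalizes ordinary string matching. The sketch runs in linear time but performs some redundant symbol comparisons, so it says nothing about the exact bounds studied in the paper.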

Related sources

Tight Comparison Bounds for the String Prefix-Matching Problem

In the string prefix-matching problem one is interested in finding the longest prefix of a pattern string of length m that occurs starting at each position of a text string of length n. This is a natural generalization of the string matching problem where only occurrences of the whole pattern are sought. The Knuth-Morris-Pratt string matching algorithm can be easily adapted to solve the string prefi...

Time Complexity of Knuth-Morris-Pratt String Matching Algorithm

This project centers on evaluating the time complexity of the Knuth-Morris-Pratt (KMP) string matching algorithm. The string matching problem is to locate a pattern string within a larger string. The best performance in terms of asymptotic time complexity is currently linear, given by the KMP algorithm. In this algorithm, firstly a prefix for the pattern string is computed and then based on this...

Crochemore's String Matching Algorithm: Simplification, Extensions, Applications

We address the problem of string matching in the special case where the pattern is very long. First, constant extra space algorithms are desirable with long patterns, and we describe a simplified version of Crochemore’s algorithm retaining its linear time complexity and constant extra space usage. Second, long patterns are unlikely to occur in the text at all. Thus we define a generalization of...

On the computational complexity of finding a minimal basis for the guess and determine attack

Guess-and-determine attack is one of the general attacks on stream ciphers. It is a common cryptanalysis tool for evaluating security of stream ciphers. The effectiveness of this attack is based on the number of unknown bits which will be guessed by the attacker to break the cryptosystem. In this work, we present a relation between the minimum numbers of the guessed bits and uniquely restricted...

Centralized Clustering Method To Increase Accuracy In Ontology Matching Systems

Ontology is the main infrastructure of the Semantic Web, providing facilities for integration, searching and sharing of information on the web. The development of ontologies as the basis of the Semantic Web, and their heterogeneity, have led to the need for ontology matching. With the emergence of large-scale ontologies in real domains, ontology matching systems face problems such as memory con...

Journal:
  • J. Algorithms

Volume 29  Issue 

Pages  -

Publication date: 1998